To get such a profile, compile your program with -g and link it with -pg and the library libbmon.a. Make sure that the bmon library is linked immediately after your .o files, before any system libraries. Example:
    gcc -g -c foo.c
    gcc -g -c bar.c
    gcc -pg -o fubar foo.o bar.o -lbmon -lm

You can specify optimization flags. This will give you a better estimate of the program's behaviour when it is optimized, but it can make the data hard to interpret, as the relation between source and object code can be obscure.
Running the program should produce a file bmon.out in the current directory (that is the current directory of the program at the moment it exits). Now run bprof on your executable. This should produce files foo.c.bprof and bar.c.bprof which are the same as the corresponding source files, but have the time spent in each line prepended to that line. A '-' in the output indicates a line where no measurable time was spent, while a '.' indicates a line that is not executable at all.
Time spent is measured in clock ticks of 1/100 of a second. Because the timing statistics are gathered stochastically, by recording which instruction is executing at each clock tick, these numbers are not exact and can vary from one run to another.
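The tick-based sampling described above can be illustrated with a small C sketch. This is not the bmon library's code. It assumes a POSIX system and uses a virtual interval timer that delivers SIGVTALRM each time the process has consumed another tick of CPU time. Where a real profiler would record the interrupted instruction address, this sketch charges the tick to a coarse "region" variable. The names current_region, hits, start_sampling, stop_sampling and burn_cpu are hypothetical and exist only for this illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose sigaction() and setitimer() */
#endif
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical bookkeeping: the region the program is currently in,
 * and the number of ticks charged to each region so far. */
volatile sig_atomic_t current_region = 0;
volatile sig_atomic_t hits[2] = {0, 0};

/* Called once per tick of consumed CPU time; charges the tick to
 * whichever region is active at that moment. */
static void on_tick(int signo)
{
    (void)signo;
    hits[current_region]++;
}

/* Install the handler and start a repeating virtual timer.
 * interval_usec must be below 1000000; 10000 gives 1/100 s ticks. */
int start_sampling(long interval_usec)
{
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGVTALRM, &sa, NULL) != 0)
        return -1;

    tv.it_interval.tv_sec = 0;
    tv.it_interval.tv_usec = interval_usec;
    tv.it_value = tv.it_interval;
    return setitimer(ITIMER_VIRTUAL, &tv, NULL);
}

/* Disarm the timer; a zeroed it_value stops it. */
int stop_sampling(void)
{
    struct itimerval off;

    memset(&off, 0, sizeof off);
    return setitimer(ITIMER_VIRTUAL, &off, NULL);
}

/* Spin until roughly `seconds` of CPU time have been used. */
void burn_cpu(double seconds)
{
    volatile unsigned long sink = 0;
    clock_t end = clock() + (clock_t)(seconds * CLOCKS_PER_SEC);

    while (clock() < end)
        for (int i = 0; i < 100000; i++)
            sink++;
}
```

Calling start_sampling(10000), setting current_region before each of two burn_cpu calls, and then calling stop_sampling leaves counts in hits[] that are roughly proportional to the CPU time spent in each region. The exact counts differ from run to run, which is the same variation the profile shows.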
The first non-option argument is taken as the name of the executable (default: a.out). Any further arguments are interpreted as the names of bmon.out files (default: bmon.out). If you specify more than one such file, bprof sums the timings from all of them.
Note that source files are always looked up by the full path name they had at the moment of compilation. If you have moved them since then, you need to recompile, move them back, or create a symlink at the original location.
No profiling samples are taken while your program is blocking the SIGVTALRM signal.
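If a section of your program has to block signals but should still show up in the profile, one option is to leave the VTALRM signal (SIGVTALRM in C) out of the blocked set. The following is a minimal sketch assuming a POSIX system. The function names block_all_but_vtalrm, restore_mask and is_blocked are hypothetical and are not part of bmon or bprof.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose sigprocmask() and friends */
#endif
#include <signal.h>
#include <stddef.h>

/* Block every signal except SIGVTALRM, so timer-driven sampling can
 * continue. The previous mask is saved in *old for later restoring. */
int block_all_but_vtalrm(sigset_t *old)
{
    sigset_t set;

    if (sigfillset(&set) != 0 || sigdelset(&set, SIGVTALRM) != 0)
        return -1;
    return sigprocmask(SIG_BLOCK, &set, old);
}

/* Put back the mask saved by block_all_but_vtalrm(). */
int restore_mask(const sigset_t *old)
{
    return sigprocmask(SIG_SETMASK, old, NULL);
}

/* Return 1 if signo is currently blocked, 0 if not, -1 on error. */
int is_blocked(int signo)
{
    sigset_t cur;

    /* With a NULL new set, sigprocmask only reports the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, signo);
}
```

In a multithreaded program pthread_sigmask would be the appropriate call instead of sigprocmask. The sketch keeps to the single-threaded case.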